============
SQUEEZER.DOC
============
7/18/81
USAGE AND RECOMPILATION DOCUMENTATION FOR:
SQ.COM, Ver 1.3: File squeezer
USQ.COM, Ver 1.4: File unsqueezer
FLS.COM, Ver 1.1: Ambiguous file name expander
--------------------
DISTRIBUTION RIGHTS:
--------------------
I allow unrestricted non-profit distribution of this
software, and invite users' groups to spread it around.
However, any distribution for profit requires my permission
in advance. This applies only to the above listed programs
and their program source and documentation files. I do sell
other software.
--------
PURPOSE:
--------
The file squeezer, SQ, compresses files into a more compact
form. This provides:
1. Faster transmission by modem.
2. Fewer diskettes to distribute a program package.
(Include USQ.COM and instructions, both unsqueezed.)
3. Fewer diskettes for archival storage.
Any file can be squeezed, but program source files and text
files benefit the most, typically shrinking by 35%. Files
containing only a limited character set, such as dictionary
files, may shrink as much as 48%. Squeezed files look like
gibberish and must be unsqueezed before they can be used.
The unsqueezer, USQ, expands squeezed files into exact
duplicates of the original or provides a quick, unsqueezed
display of the tops of (or all of) squeezed files.
Unsqueezing requires only a single pass.
Both SQ and USQ accept batches of work specified by lists of
file names (with drives if needed) and miscellaneous
options. They accept these parameters in any of three ways:
1. On the CP/M command line.
2. From the console keyboard.
3. From a file.
The FLS program can be used (on the same command line!) to
expand parameter lists containing wild-card (ambiguous) file
names into lists with the specific file names required by SQ
and USQ.
This combination of programs allows you to issue a single
command which will produce many squeezed or unsqueezed files
from and to various diskettes. For example, to unsqueeze all
squeezed ASM files on drive B and send the results to drive
C and also unsqueeze all squeezed TXT files on drive A and
send the results to drive D:
A>fls c: b:*.aqm d: *.tqt |usq
For detailed instructions see USAGE.
This DOES run under plain old vanilla CP/M! Many of the
smarts are buried in the COM files in the form of library
routines provided with the BDS C package (available from
Lifeboat).
The above example simulates a "pipe" (indicated by the "|")
by sending the "console" output of the fls.com program to a
temporary file and then running the usq.com program with
options which cause it to read its parameters from its
"console" input, which is really redirected to come from the
temporary file.
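The expansion step that FLS performs in the example command can be
sketched as follows. This is an illustration only, not the FLS source:
the function name expand, the directories mapping (a stand-in for the
real disk directories) and the use of Python's fnmatch in place of
CP/M's own ambiguous-name matching are all assumptions, and the order
in which FLS lists matching files is not stated in this document.

```python
from fnmatch import fnmatchcase


def expand(params, directories, current_drive="A"):
    """Expand wild-card parameters; pass everything else through unchanged.

    directories is a hypothetical {drive letter: [file names]} mapping
    that stands in for the real disk directories.
    """
    out = []
    for p in params:
        if "*" not in p and "?" not in p:
            out.append(p)  # drive change, option or specific name
            continue
        if len(p) > 1 and p[1] == ":":
            drive, pattern, prefix = p[0].upper(), p[2:], p[:2]
        else:
            drive, pattern, prefix = current_drive, p, ""
        # Sorted order is an arbitrary choice made for this sketch.
        for name in sorted(directories.get(drive, [])):
            if fnmatchcase(name.lower(), pattern.lower()):
                out.append(prefix + name)
    return out


# Hypothetical directory contents, for illustration only.
dirs = {"A": ["notes.tqt", "sq.com"], "B": ["x.aqm", "y.aqm", "z.asm"]}
print(expand("c: b:*.aqm d: *.tqt".split(), dirs))
```

With these made-up directories the list handed on to USQ would be
c:, b:x.aqm, b:y.aqm, d:, notes.tqt. The destination drive
parameters pass through untouched.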
-------
THEORY:
-------
The data in the file is treated at the byte level rather
than the word level, and can contain absolutely anything.
The compression is in two stages: first repeated byte values
are compressed and then a Huffman code is dynamically
generated to match the properties of each particular file.
This requires two passes over the source data.
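The first stage, compressing repeated byte values, can be illustrated
with a small run-length coder. This is a sketch under stated
assumptions and not the SQ file format: the marker value 0x90, the
minimum run length of 3 and the way a literal marker byte is escaped
are choices made for this example. The source files describe what SQ
actually does.

```python
MARKER = 0x90  # assumed marker byte, chosen for this sketch


def rle_encode(data: bytes) -> bytes:
    """Collapse runs of 3..255 equal bytes into: value, MARKER, count."""
    out = bytearray()
    i = 0
    while i < len(data):
        value = data[i]
        run = 1
        while i + run < len(data) and data[i + run] == value and run < 255:
            run += 1
        if value == MARKER:
            # A literal marker is escaped as MARKER, 0 (once per byte).
            out += bytes([MARKER, 0]) * run
        elif run >= 3:
            out += bytes([value, MARKER, run])
        else:
            out += bytes([value]) * run
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    """Invert rle_encode."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == MARKER:
            count = data[i + 1]
            if count == 0:
                out.append(MARKER)  # escaped literal marker
            else:
                # count includes the copy that was already emitted
                out += bytes([out[-1]]) * (count - 1)
            i += 2
        else:
            out.append(byte)
            i += 1
    return bytes(out)
```

Because the data can contain absolutely anything, the marker byte
itself must be representable, which is why the sketch escapes it.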
The decoding table is included in the squeezed file, so
squeezing short files can actually lengthen them. Fixed
decoding tables are not used because English and various
computer languages vary greatly as to upper and lower case
proportions and use of special characters. Much of the
savings comes from not assigning codes to unused byte
values.
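The second stage can be illustrated by building Huffman code lengths
from the byte counts of one particular file. This is a generic
textbook construction, shown only to make the point about unused byte
values concrete. It does not reproduce SQ's decoding table layout or
any limits SQ places on code length, and the function name is made up
for this sketch.

```python
import heapq
from collections import Counter


def huffman_code_lengths(data: bytes) -> dict:
    """Return {byte value: code length in bits} for bytes present in data."""
    counts = Counter(data)  # first pass over the data: count byte values
    if not counts:
        return {}
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tie-breaker, byte values in subtree)
    heap = [(weight, value, [value]) for value, weight in counts.items()]
    heapq.heapify(heap)
    lengths = {value: 0 for value in counts}
    tie = 256
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for value in s1 + s2:
            lengths[value] += 1  # each merge adds one bit to these codes
        heapq.heappush(heap, (w1 + w2, tie, s1 + s2))
        tie += 1
    return lengths


lengths = huffman_code_lengths(b"aaaabbc")
print(lengths)
```

For the seven bytes "aaaabbc" only three byte values receive codes
(1 bit for "a", 2 bits each for "b" and "c"), so the data needs 10
bits instead of 56, before counting the space taken by the decoding
table.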
More detailed comments are included in the source files.
---------------
USAGE TUTORIAL:
---------------
As usual, you have to learn how to tell the programs what to
do (i.e., what parameters to type after the program name).
First I will introduce the various possibilities by example.
Then I will summarize the rules.
In the simplest case either SQ or USQ can simply be given
one or more file names (with or without drive names):
A>sq xyz.asm
A>sq thisfile.doc b:thatfile.doc
will create squeezed files xyz.aqm, thisfile.dqc and
thatfile.dqc, all on the current drive, A. The original
files are not disturbed. Note that the names of the squeezed
files are generated by rules - you don't specify them.
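The naming rule can be inferred from the examples (asm becomes aqm,
doc becomes dqc): the middle letter of the file type is replaced by
"q". The sketch below encodes only that inference. The function name
is hypothetical, and what SQ does with file types shorter than three
letters is not covered here, so the sketch rejects them rather than
guess.

```python
def squeezed_name(filename: str) -> str:
    """Replace the middle letter of a three-letter file type with 'q'."""
    name = filename.split(":")[-1]  # drop a drive prefix such as "b:"
    base, dot, ext = name.rpartition(".")
    if not dot or len(ext) != 3:
        raise ValueError("sketch only handles three-letter file types")
    return f"{base}.{ext[0]}q{ext[2]}"


for original in ["xyz.asm", "thisfile.doc", "b:thatfile.doc"]:
    print(original, "->", squeezed_name(original))
```

The drive prefix is dropped because the squeezed file goes to the
destination drive, not to the drive the original came from.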
Likewise,
A>usq xyz.aqm
will create file xyz.asm on the A drive, overwriting the
original. (The original name is recreated from information
stored in the squeezed version.) The squeezed version is not
disturbed.
Each file name is processed in order, and you can list all
the files you can fit in a command. The file names given to
SQ and USQ must be specific. You will learn below how to use
the FLS program to expand patterns like *.asm (all files of
type asm) into a list of specific names and feed them into
SQ or USQ.
The above examples let the destination drive default to the
current logged drive, which was shown in the prompt to be A.
You can change the destination drive as often as you like in
the parameter list. For example,
A>sq x.asm b: y.asm z.asm c: d:s.asm
will create x.aqm on the current drive, A, y.aqm and z.aqm
on the B drive and s.aqm on the C drive. Note that the first
three originals are on drive A and the last one is on drive
D. Remember that each parameter is processed in order, so
you must change the destination drive before you specify the
files to be created on that drive.
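The in-order processing described above can be sketched as a small
loop that remembers the most recent destination drive. The function
name plan_squeeze and the tuple layout are made up for this
illustration, and option parameters are not handled.

```python
def plan_squeeze(params, current_drive="A"):
    """Return (source file, destination drive) pairs in processing order."""
    dest = current_drive
    jobs = []
    for p in params:
        if len(p) == 2 and p[1] == ":":
            dest = p[0].upper()  # a bare drive name changes the destination
        elif len(p) > 2 and p[1] == ":":
            jobs.append((p[0].upper() + p[1:], dest))  # source names its drive
        else:
            jobs.append((f"{current_drive}:{p}", dest))  # source on current drive
    return jobs


for source, dest in plan_squeeze("x.asm b: y.asm z.asm c: d:s.asm".split()):
    print(source, "-> drive", dest)
```

For the command shown above this yields A:x.asm to drive A, A:y.asm
and A:z.asm to drive B, and D:s.asm to drive C.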
Eventually you will have diskettes with many squeezed files
on them and you will wonder what is in which file. If they
weren't squeezed you would use the TYPE command to look at
the comments at the beginning of the files. But squeezed
files just make a mess on your CRT screen when you TYPE
them, so I have provided the required feature as a preview
option to the USQ program.
A>usq -10 x.bas b:y.asm
will not take the time to create unsqueezed files. Instead
it will unsqueeze the first 10 lines of each file and
display them on your console. The display from each file
consists of the file names, the data and a formfeed (FF).
Also,
A>usq - c:xyz.mac
will unsqueeze and display the first 65,535 lines of any
files listed. That's the biggest number you can give it, and
is intended to display the whole file.
This preview option also ensures that the data is
displayable. The parity bit is stripped off (some Wordstar
files use it for format control) and any unusual control
characters are converted to periods. You'll see some of
these at the end of the files as the CP/M end of file is
treated as data and the remainder of the sector is
displayed.
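The clean-up applied by the preview option can be sketched as a
per-byte filter. Which control characters count as "unusual" is not
spelled out here, so the set kept below (TAB, LF and CR) is an
assumption, as is the function name.

```python
KEEP = {0x09, 0x0A, 0x0D}  # assumed "usual" controls: TAB, LF, CR


def displayable(data: bytes) -> str:
    """Strip the parity bit and replace other control characters with '.'."""
    chars = []
    for byte in data:
        byte &= 0x7F  # strip the parity (high) bit
        if (byte < 0x20 and byte not in KEEP) or byte == 0x7F:
            chars.append(".")
        else:
            chars.append(chr(byte))
    return "".join(chars)


# 0xC8 is "H" with the high bit set; 0x1A is the CP/M end of file mark.
print(repr(displayable(bytes([0xC8, 0x69, 0x1A, 0x0A]))))
```

The end of file mark comes out as a period, which matches the periods
seen at the end of a previewed file.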
You are now familiar with all of the operational parameters
of SQ and USQ. But so far you have always typed them on the
command line which caused the program to be run. For reasons
which will become apparent later, I have also provided an
interactive mode. If there are no parameters (except
directed i/o parameters, described later) on the command
line, SQ and USQ will prompt with an asterisk and accept
parameters from the console keyboard. Each parameter must be
followed by RETURN and will be processed immediately. An
empty command (just RETURN) will cause the program to exit
back to CP/M. Try it - it will help you understand what
follows.
Now let's get into directed i/o, which will be new to most of
you, but will save you so much work you will wonder how you
ever got along without it.
Perhaps you frequently squeeze or unsqueeze the same list of
files and you would like to type the list once and be done
with it. Use an editor (or FLS, described below) to create a
file with one parameter per line. For example call it
commands.lst.
Then,
A>sq <commands.lst
will cause the command list file to be read as if you were
typing it! You will see it on the console.
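Reading parameters one per line, whether typed at the asterisk prompt
or redirected from a file, can be sketched as below. The function
name is hypothetical, and stopping at the first empty line is an
assumption carried over from the interactive mode, where an empty
command exits.

```python
import io


def read_parameters(stream):
    """Yield one parameter per line; stop at an empty line or end of input."""
    for line in stream:
        param = line.strip()
        if not param:
            break  # an empty command ends the session
        yield param


# io.StringIO stands in for a command list file such as commands.lst.
command_list = io.StringIO("b:\nx.asm\ny.asm\n")
print(list(read_parameters(command_list)))
```

Each parameter is handed over as soon as its line is read, which
mirrors the immediate processing described for the interactive mode.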
That was redirected console input. Now assume that you have
a very long l